Higher Order Spectral Phase Features for Speaker Identification

Authors

Vinod Chandran

Sridha Sridharan

Abstract

This paper investigates the use of higher order spectra (HOS) phase features in the task of speaker identification. Within the speech processing community, short time spectral phase information is widely regarded as unimportant for speaker recognition, and features are generally defined from the magnitude spectrum only. This paper utilises features that contain both magnitude and phase spectral information. These HOS phase features are derived by integrating points along a straight line in bifrequency space. Initial experiments used unconstrained microphone speech from a database of 20 male speakers to construct a Gaussian mixture model (GMM) for each speaker. The HOS phase features achieve a correct identification rate of 98.5%, which is similar to the rate achieved by the MFCC feature set (99.4%). Other experiments were conducted on the larger YOHO database of 138 speakers. Average correct identification rates above 95% were achieved for varying population sizes up to the full 138 speakers.
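As a rough illustration of the kind of feature the abstract describes, the sketch below takes the phase of the bispectrum B(f1, f2) = X(f1) X(f2) X*(f1 + f2) summed along the line f2 = a·f1 in bifrequency space. This is one common formulation of integrated bispectral phase features and is not confirmed by the abstract as the authors' exact method. The function name `bispectral_phase_feature`, the FFT length, and the linear interpolation at fractional bins are all assumptions made for illustration.

```python
import numpy as np


def bispectral_phase_feature(frame, slope, n_fft=256):
    """Phase of the bispectrum summed along the line f2 = slope * f1.

    Hypothetical helper: 0 < slope <= 1 is assumed, and the spectrum is
    linearly interpolated at fractional bins (an illustrative choice).
    """
    X = np.fft.fft(frame, n_fft)
    half = n_fft // 2  # bin index of the Nyquist frequency

    # Keep f1 + f2 at or below Nyquist: f1 <= half / (1 + slope).
    k1_max = int(np.floor(half / (1.0 + slope)))

    def spectrum_at(k):
        # Linear interpolation of X at a (possibly fractional) bin k.
        lo = int(np.floor(k))
        frac = k - lo
        return (1.0 - frac) * X[lo] + frac * X[lo + 1]

    total = 0j
    for k1 in range(1, k1_max + 1):
        k2 = slope * k1
        # Bispectrum sample B(f1, f2) = X(f1) X(f2) conj(X(f1 + f2)).
        total += X[k1] * spectrum_at(k2) * np.conj(spectrum_at(k1 + k2))

    # The feature is the phase of the integrated (summed) bispectrum.
    return np.angle(total)


# Usage: one feature per slope for a single speech frame (random data here).
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
features = [bispectral_phase_feature(frame, a) for a in (0.25, 0.5, 1.0)]
```

Because the linear phase terms cancel in the triple product, a feature of this form is unchanged by a circular time shift of the frame (exactly so when the line passes through integer bins, as with a slope of 1) and by positive scaling. In a setup like the one the abstract outlines, a vector of such features per frame would be the input to each speaker's GMM.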


Similar resources

The effectiveness of higher order spectral phase features in speaker identification

This paper studies the effectiveness of higher order spectra (HOS) phase features in the task of speaker identification. Within the speech processing community, short time spectral phase information is generally regarded as unimportant for speaker recognition. In fact, the most commonly used features for speaker recognition are the Mel frequency cepstral coefficients (MFCC), which are defined f...


On the use of perceptual Line Spectral pairs Frequencies and higher-order residual moments for Speaker Identification

Conventional Speaker Identification (SI) systems utilise spectral features like Mel-Frequency Cepstral Coefficients (MFCC) or Perceptual Linear Prediction (PLP) as a frontend module. Line Spectral pairs Frequencies (LSF) are a popular alternative representation of Linear Prediction Coefficients (LPC). In this paper, an investigation is carried out to extract LSF from perceptually modified speech....


Features for speaker and language identification

In this paper we examine several features derived from the speech signal for the purpose of identification of speaker or language from the speech signal. Most of the current systems for speaker and language identification use spectral features from short segments of speech. There are additional features which can be derived from the residual of the speech signal, which correspond to th...


Improving Performance of Speaker Identification System Using Complementary Information Fusion

Feature extraction plays an important role as a front-end processing block in the speaker identification (SI) process. Most SI systems use features such as Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), and Linear Predictive Cepstral Coefficients (LPCC) to represent the speech signal. Their derivations are based on short term processing of speech signal and...


Feature Level Compensation for Robust Speaker Identification in Mismatched Conditions

In this paper, robust front end features are proposed to improve speaker identification (SI) performance under real world conditions, such as mismatch between training and testing conditions. The most commonly used MFCC features are very sensitive to effects such as channel and environment mismatch. Characteristics of speech get changed with room acoustics, ch...




Publication date: 2004
